Contextual Sentence Decomposition

Authors

  • Elmar Haussmann
  • Hannah Bast
  • Martin Riedmiller
Abstract

In this thesis, we introduce and study contextual sentence decomposition, which, intuitively, decomposes a given sentence into parts that semantically “belong together”. For example, a valid decomposition of the sentence “Usable parts of rhubarb include the edible stalks and the medicinally used roots, however its leaves are toxic” consists of the sub-sentences “Usable parts of rhubarb include the edible stalks”, “Usable parts of rhubarb include the medicinally used roots” and “however its leaves are toxic”. Our motivation for this problem comes from semantic full-text search. For the query plant edible leaves, semantic full-text search returns passages in which instances of a plant, such as “rhubarb” (and not the word “plant”), are mentioned along with the words “edible” and “leaves”. One of the results this query might erroneously return is the original sentence above. With contextual sentence decomposition we avoid this false positive while preserving the factual content of the original sentence. We propose two approaches to the problem, one based on a set of rules and one using machine learning. On a manually assembled ground truth, we achieve an F-measure of about 65 percent for the former and 40 percent for the latter. For semantic full-text search based on these approaches, evaluated on the English Wikipedia (27 GB of raw text), the F-measure nearly doubles for some queries.
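The effect on search can be illustrated with a minimal sketch. It is not the rule-based or machine-learning method of the thesis: the decomposition is hard-coded by hand with the three sub-sentences of the rhubarb example, and the entity list `PLANTS` and the helpers `tokens` and `matches` are hypothetical names introduced only for this illustration. The sketch compares matching the query against the whole sentence with matching it against each sub-sentence.

```python
import re

# Hypothetical stand-in for "instances of a plant" known to the search engine.
PLANTS = {"rhubarb"}

def tokens(text):
    """Lowercased set of alphabetic word tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matches(text, words, entities=PLANTS):
    """True if the text mentions some plant instance and every query word."""
    toks = tokens(text)
    return bool(toks & entities) and all(w in toks for w in words)

SENTENCE = ("Usable parts of rhubarb include the edible stalks and the "
            "medicinally used roots, however its leaves are toxic")

# Hand-written decomposition (assumption: not produced by any algorithm here).
PARTS = [
    "Usable parts of rhubarb include the edible stalks",
    "Usable parts of rhubarb include the medicinally used roots",
    "however its leaves are toxic",
]

QUERY_WORDS = ["edible", "leaves"]

# Matching on the whole sentence yields the false positive ...
whole_sentence_hit = matches(SENTENCE, QUERY_WORDS)
# ... while no single sub-sentence satisfies the full query.
part_hits = [p for p in PARTS if matches(p, QUERY_WORDS)]

print(whole_sentence_hit)
print(part_hits)
```

In this toy setting the whole sentence satisfies the query, whereas none of the sub-sentences does, so restricting co-occurrence to a single sub-sentence removes the false positive. A query such as plant edible would still match the first sub-sentence, so the true content of the sentence stays retrievable.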

Similar resources

Open Information Extraction via Contextual Sentence Decomposition

We show how contextual sentence decomposition (CSD), a technique originally developed for high-precision semantic search, can be used for open information extraction (OIE). Intuitively, CSD decomposes a sentence into the parts that semantically “belong together”. By identifying the (implicit or explicit) verb in each such part, we obtain facts like in OIE. We compare our system, called CSD-IE, ...

Modelling pause duration as a function of contextual length

Effects of contextual length are known to affect pause durations in neutral speech. The present study investigates these effects on an expressive corpus of read tales in French. Computational models of intra-sentence, and inter-sentence pause durations, as functions of contextual lengths are proposed. These models are aimed at improving Text-To-Speech synthesis systems, and provide clues for sy...

An effective sentence-extraction technique using contextual information and statistical approaches for text summarization

This paper proposes an effective method to extract salient sentences using contextual information and statistical approaches for Text Summarization. The proposed method combines two consecutive sentences into a bi-gram pseudo sentence so that contextual information is applied to statistical sentence-extraction techniques. Salient bigram pseudo sentences are first selected by the statistical sen...

Using determiners as contextual cues in sentence comprehension: A comparison between younger and older adults

Younger adults use both semantic and phonological cues to quickly and efficiently localize the referent during sentence comprehension. While some behavioral studies suggest that older adults use contextual information even more strongly than younger adults, ERP studies have shown that this population, as a group, is less apt at using contextual semantic cues to predict upcoming words. The curre...

Homographic Ideogram Understanding Using Contextual Dynamic Network

Conventional methods for disambiguation problems have been using statistical methods with co-occurrence of words in their contexts. It seems that human-beings assign appropriate word senses to the given ambiguous word in the sentence depending on the words which followed the ambiguous word when they could not disambiguate by using the previous contextual information. In this research, Contextua...

Publication date: 2012